Dimensionality Reduction in Genomics and Proteomics

Authors

  • Milos Hauskrecht
  • Richard Pelikan
  • Michal Valko
  • James Lyons
Abstract

Finding reliable, meaningful patterns in data with a large number of attributes can be extremely difficult. Feature selection helps us decide which attributes, or combinations of attributes, are most important for finding these patterns. In this chapter, we study feature selection methods for building classification models from high-throughput genomic (microarray) and proteomic (mass spectrometry) data sets. In such data sets, thousands of candidate features must be analyzed, compared and combined. We describe the basics of four different approaches to feature selection and illustrate their effects on an MS cancer proteomic data set. The closing discussion offers guidance on analyzing high-dimensional genomic and proteomic data.
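As a rough illustration of what feature selection for classification can look like, the sketch below ranks features with a univariate Fisher-style score and keeps the top k. This is an assumption made for illustration only: the abstract does not name the four approaches the chapter covers, so this should not be read as one of the authors' methods. The function names (`fisher_score`, `top_k_features`) and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Score one feature: squared gap between class means over summed class variances."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # No within-class variation: perfectly separating if the means differ.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def top_k_features(rows, labels, k):
    """Return the indices of the k highest-scoring features."""
    n_features = len(rows[0])
    scores = [
        fisher_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data, made up for illustration: 6 samples, 3 features, binary labels.
rows = [
    [1.0, 5.0, 0.3],
    [1.2, 4.8, 0.9],
    [0.9, 5.1, 0.1],
    [3.0, 5.0, 0.8],
    [3.1, 4.9, 0.2],
    [2.9, 5.2, 0.7],
]
labels = [0, 0, 0, 1, 1, 1]

selected = top_k_features(rows, labels, k=1)
print(selected)
```

In this toy data only the first feature differs clearly between the two classes, so it ranks highest. With thousands of candidate features, as in microarray or mass spectrometry data, a per-feature score like this is cheap to compute but ignores interactions between features.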

Similar resources

Development of a supervised multivariate statistical algorithm for enhanced interpretability of multiblock analysis

In modern biological research, OMICs techniques, such as genomics, proteomics or metabolomics, are often employed to gain deep insights into metabolic regulations and biochemical perturbations in response to a specific research question. To gain complementary biologically relevant information, multiOMICs, i.e., several different OMICs measurements on the same specimen, is becoming increasingly ...

Lipidomics: From Required Tools to Applications in Health Studies

Omics refers to a branch of biological science that evaluates information systematically and broadly. Although the initial focus was on genomics and proteomics, advances in analytical instruments have led to growing recognition of the potential of subfields such as lipidomics. Lipidomics studies have largely changed the previously limited view of lipids as basic com...

The 13th International Conference on Neural Information Processing

High-throughput genomics and proteomics data have been a major source of information in current systems biology investigations. Machine learning methods such as support vector machines (SVMs), neural networks and dimension reduction have been playing an active role in the analysis and mining of these data, constituting one of the major efforts in current bioinformatics research. A typical scenario...

2D Dimensionality Reduction Methods without Loss

In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques have been applied in a lossless dimensionality reduction framework for face recognition. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...

Precision Medicine: A New Revolution in Healthcare System

Every human being is different based on genetics, lifestyle, and environmental factors. Novel medical technologies have become more precise owing to molecular information, including genomics, transcriptomics, proteomics, metabolomics, etc. The “omics” technologies have opened up new horizons for healthcare systems, enabling them to prevent and/or diagnose diseases more precisel...

Publication date: 2011